A Survey of Statistical Approaches to Preserving Confidentiality of Contingency Table Entries

Authors

  • Stephen E. Fienberg
  • Aleksandra B. Slavkovic
Abstract

In the statistical literature, there has been considerable development of methods for releasing multivariate categorical data sets, where the releases come in the form of marginal and conditional tables corresponding to subsets of the categorical variables. In this chapter we provide an overview of this methodology and relate it to the literature on the release of association rules, which can be viewed as conditional tables. We illustrate this with two examples. A related problem, "association rule hiding," is often independently studied in the database

Similar resources

Statistical Disclosure Limitation with Released Marginals and Conditionals for Contingency Tables

The goal of statistical disclosure limitation is to develop methods and tools that, while preserving confidentiality, can provide access to useful statistical data, not just a few numbers. In this paper we consider releases from contingency tables in the form of marginal counts and observed conditional frequencies. We link data utility to log-linear models, and evaluation of disclosure risk to bo...

Bounds for Cell Entries in Two-Way Tables Given Conditional Relative Frequencies

In recent work on statistical methods for confidentiality and disclosure limitation, Dobra and Fienberg (2000, 2003) and Dobra (2002) have generalized Bonferroni-Fréchet-Hoeffding bounds for cell entries in k-way contingency tables given marginal totals. In this paper, we consider extensions of their approach focused on upper and lower bounds for cell entries given arbitrary sets of marginals a...
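
For the simplest case this abstract alludes to, a two-way table with both one-way margins released, the classical Fréchet bounds on a cell count n_ij are max(0, n_i+ + n_+j - n) <= n_ij <= min(n_i+, n_+j). The Python sketch below computes them. The function name `frechet_bounds` and the example margins are illustrative assumptions, not taken from the cited papers, and the sketch does not cover the generalized k-way or conditional-frequency settings those papers address.

```python
from typing import List, Tuple


def frechet_bounds(
    row_totals: List[int], col_totals: List[int]
) -> List[List[Tuple[int, int]]]:
    """Return (lower, upper) Frechet bounds for every cell of a two-way table.

    Hypothetical helper for illustration: it assumes only the row and
    column totals of a table of non-negative integer counts are released.
    """
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must sum to the same grand total")
    # Lower bound: max(0, row + col - n); upper bound: min(row, col).
    return [
        [(max(0, r + c - n), min(r, c)) for c in col_totals]
        for r in row_totals
    ]


# Example margins (made up for illustration): rows sum to 10, columns sum to 10.
bounds = frechet_bounds([3, 7], [4, 6])
for row in bounds:
    print(row)
```

A narrow interval, such as a cell whose lower and upper bounds coincide, signals that the released margins reveal that entry exactly, which is the disclosure-risk concern these bounds are used to assess.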

Partial Information Releases for Confidential Contingency Table Entries: Present and Future Research Efforts

Tabular data have been a staple product for disseminating information derived from the confidential microdata that fuel social science research and inform policy decisions. This paper outlines recent results on disclosure risk assessment associated with the release of high-dimensional contingency tables, and discusses some related research problems. The main focus is the partial information rel...

Preserving confidentiality of high-dimensional tabulated data: Statistical and computational issues

Dissemination of information derived from large contingency tables formed from confidential data is a major responsibility of statistical agencies. In this paper we present solutions to several computational and algorithmic problems that arise in the dissemination of cross-tabulations (marginal sub-tables) from a single underlying table. These include data structures that exploit sparsity to su...

Web Systems That Disseminate Information But Protect Confidential Data

Statistical agencies have longstanding concern over confidentiality of their data [14, 15]. But agencies must also report information to the public. This tension between confidentiality and dissemination of statistical information is heightened by the emergence of the World Wide Web as a means of communication. On the one hand, confidentiality is threatened by advances in information technology...

Publication date: 2008